Cyberinfrastructure Exploration Server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Performance, optimization, and fitness: Connecting applications to architectures

Internal identifier: 000767 (Main/Exploration); previous: 000766; next: 000768

Authors: Mohammad A. Bhuiyan [United States]; Melissa C. Smith [United States]; Vivek K. Pallipuram [United States]

Source: Concurrency and Computation: Practice and Experience, vol. 23, no. 10, pp. 1066-1100 (2011-07)

RBID : ISTEX:E43C7B0758A3069211F596C690F59417BBC0471F

English descriptors

Fitness model; GPU; multicore; optimization; performance

Abstract

Recent trends involving multicore processors and graphical processing units (GPUs) focus on exploiting task‐ and thread‐level parallelism. In this paper, we have analyzed various aspects of the performance of these architectures, including NVIDIA GPUs and multicore processors such as the Intel Xeon, AMD Opteron, and IBM's Cell Broadband Engine. The case study used in this paper is a biological spiking neural network (SNN), implemented with the Izhikevich, Wilson, Morris–Lecar, and Hodgkin–Huxley neuron models. The four SNN models have varying requirements for communication and computation, making them useful for performance analysis of the hardware platforms. We report and analyze the variation of performance with network (problem size) scaling, available optimization techniques, and execution configuration. A Fitness performance model that predicts the suitability of an architecture for accelerating an application is proposed and verified with the SNN implementation results. The Roofline model, an existing performance model, has also been used to determine the hardware bottleneck(s) and attainable peak performance of the architectures. Significant speedups for the four SNN neuron models on these architectures are reported; the maximum speedup of 574x was observed in our GPU implementation. Our results and analysis show that a proper match of architecture with algorithm complexity provides the best performance. Copyright © 2010 John Wiley & Sons, Ltd.
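
The abstract mentions the Roofline model as a way to identify hardware bottlenecks and attainable peak performance. As a rough illustration only, the standard Roofline bound can be sketched as below. This is not the authors' code; the function names and the sample numbers are hypothetical and are not taken from the paper.

```python
def roofline_gflops(peak_gflops, peak_bandwidth_gbs, intensity_flops_per_byte):
    """Attainable GFLOP/s under the standard Roofline bound.

    Performance is capped either by the compute peak or by
    memory bandwidth times operational intensity, whichever is lower.
    """
    if min(peak_gflops, peak_bandwidth_gbs, intensity_flops_per_byte) < 0:
        raise ValueError("all inputs must be non-negative")
    return min(peak_gflops, peak_bandwidth_gbs * intensity_flops_per_byte)


def bottleneck(peak_gflops, peak_bandwidth_gbs, intensity_flops_per_byte):
    """Return 'memory' left of the ridge point, 'compute' otherwise."""
    ridge = peak_gflops / peak_bandwidth_gbs  # FLOP/byte where both caps meet
    return "memory" if intensity_flops_per_byte < ridge else "compute"


if __name__ == "__main__":
    # Hypothetical device: 100 GFLOP/s compute peak, 10 GB/s memory bandwidth.
    for intensity in (2.0, 50.0):
        print(intensity,
              roofline_gflops(100.0, 10.0, intensity),
              bottleneck(100.0, 10.0, intensity))
```

Under these assumed numbers, a kernel with low operational intensity is limited by memory bandwidth, while a kernel with high intensity is limited by the compute peak.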

Url: https://api.istex.fr/document/E43C7B0758A3069211F596C690F59417BBC0471F/fulltext/pdf
DOI: 10.1002/cpe.1688


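Affiliations: Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634, United States (all three authors)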



The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Performance, optimization, and fitness: Connecting applications to architectures</title>
<author>
<name sortKey="Bhuiyan, Mohammad A" sort="Bhuiyan, Mohammad A" uniqKey="Bhuiyan M" first="Mohammad A." last="Bhuiyan">Mohammad A. Bhuiyan</name>
</author>
<author>
<name sortKey="Smith, Melissa C" sort="Smith, Melissa C" uniqKey="Smith M" first="Melissa C." last="Smith">Melissa C. Smith</name>
</author>
<author>
<name sortKey="Pallipuram, Vivek K" sort="Pallipuram, Vivek K" uniqKey="Pallipuram V" first="Vivek K." last="Pallipuram">Vivek K. Pallipuram</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:E43C7B0758A3069211F596C690F59417BBC0471F</idno>
<date when="2011" year="2011">2011</date>
<idno type="doi">10.1002/cpe.1688</idno>
<idno type="url">https://api.istex.fr/document/E43C7B0758A3069211F596C690F59417BBC0471F/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000786</idno>
<idno type="wicri:Area/Istex/Curation">000786</idno>
<idno type="wicri:Area/Istex/Checkpoint">000187</idno>
<idno type="wicri:doubleKey">1532-0626:2011:Bhuiyan M:performance:optimization:and</idno>
<idno type="wicri:Area/Main/Merge">000769</idno>
<idno type="wicri:Area/Main/Curation">000767</idno>
<idno type="wicri:Area/Main/Exploration">000767</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Performance, optimization, and fitness: Connecting applications to architectures</title>
<author>
<name sortKey="Bhuiyan, Mohammad A" sort="Bhuiyan, Mohammad A" uniqKey="Bhuiyan M" first="Mohammad A." last="Bhuiyan">Mohammad A. Bhuiyan</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634</wicri:regionArea>
<placeName>
<region type="state">Caroline du Sud</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Smith, Melissa C" sort="Smith, Melissa C" uniqKey="Smith M" first="Melissa C." last="Smith">Melissa C. Smith</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634</wicri:regionArea>
<placeName>
<region type="state">Caroline du Sud</region>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Pallipuram, Vivek K" sort="Pallipuram, Vivek K" uniqKey="Pallipuram V" first="Vivek K." last="Pallipuram">Vivek K. Pallipuram</name>
<affiliation wicri:level="2">
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Department of Electrical and Computer Engineering, Clemson University, Clemson, SC 29634</wicri:regionArea>
<placeName>
<region type="state">Caroline du Sud</region>
</placeName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="j">Concurrency and Computation: Practice and Experience</title>
<title level="j" type="abbrev">Concurrency Computat.: Pract. Exper.</title>
<idno type="ISSN">1532-0626</idno>
<idno type="eISSN">1532-0634</idno>
<imprint>
<publisher>John Wiley &amp; Sons, Ltd.</publisher>
<pubPlace>Chichester, UK</pubPlace>
<date type="published" when="2011-07">2011-07</date>
<biblScope unit="volume">23</biblScope>
<biblScope unit="issue">10</biblScope>
<biblScope unit="page" from="1066">1066</biblScope>
<biblScope unit="page" to="1100">1100</biblScope>
</imprint>
<idno type="ISSN">1532-0626</idno>
</series>
<idno type="istex">E43C7B0758A3069211F596C690F59417BBC0471F</idno>
<idno type="DOI">10.1002/cpe.1688</idno>
<idno type="ArticleID">CPE1688</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">1532-0626</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Fitness model</term>
<term>GPU</term>
<term>multicore</term>
<term>optimization</term>
<term>performance</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Recent trends involving multicore processors and graphical processing units (GPUs) focus on exploiting task‐ and thread‐level parallelism. In this paper, we have analyzed various aspects of the performance of these architectures, including NVIDIA GPUs and multicore processors such as the Intel Xeon, AMD Opteron, and IBM's Cell Broadband Engine. The case study used in this paper is a biological spiking neural network (SNN), implemented with the Izhikevich, Wilson, Morris–Lecar, and Hodgkin–Huxley neuron models. The four SNN models have varying requirements for communication and computation, making them useful for performance analysis of the hardware platforms. We report and analyze the variation of performance with network (problem size) scaling, available optimization techniques, and execution configuration. A Fitness performance model that predicts the suitability of an architecture for accelerating an application is proposed and verified with the SNN implementation results. The Roofline model, an existing performance model, has also been used to determine the hardware bottleneck(s) and attainable peak performance of the architectures. Significant speedups for the four SNN neuron models on these architectures are reported; the maximum speedup of 574x was observed in our GPU implementation. Our results and analysis show that a proper match of architecture with algorithm complexity provides the best performance. Copyright © 2010 John Wiley &amp; Sons, Ltd.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Caroline du Sud</li>
</region>
</list>
<tree>
<country name="États-Unis">
<region name="Caroline du Sud">
<name sortKey="Bhuiyan, Mohammad A" sort="Bhuiyan, Mohammad A" uniqKey="Bhuiyan M" first="Mohammad A." last="Bhuiyan">Mohammad A. Bhuiyan</name>
</region>
<name sortKey="Pallipuram, Vivek K" sort="Pallipuram, Vivek K" uniqKey="Pallipuram V" first="Vivek K." last="Pallipuram">Vivek K. Pallipuram</name>
<name sortKey="Smith, Melissa C" sort="Smith, Melissa C" uniqKey="Smith M" first="Melissa C." last="Smith">Melissa C. Smith</name>
</country>
</tree>
</affiliations>
</record>

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/CyberinfraV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000767 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000767 | SxmlIndent | more

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    CyberinfraV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:E43C7B0758A3069211F596C690F59417BBC0471F
   |texte=   Performance, optimization, and fitness: Connecting applications to architectures
}}

This area was generated with Dilib version V0.6.25.
Data generation: Thu Oct 27 09:30:58 2016. Site generation: Sun Mar 10 23:08:40 2024